Learning from errors: Using vector-based compositional semantics for parse reranking
Abstract
In this paper, we address the problem of how to use semantics to improve syntactic parsing, by using a hybrid reranking method: a k-best list generated by a symbolic parser is reranked based on parse-correctness scores given by a compositional, connectionist classifier. This classifier uses a recursive neural network to construct vector representations for phrases in a candidate parse tree in order to classify it as syntactically correct or not. Tested on WSJ section 23, our method achieved a statistically significant improvement of 0.20% on F-score (2% error reduction) and 0.95% on exact match, compared with the state-of-the-art Berkeley parser. This result shows that vector-based compositional semantics can be usefully applied in syntactic parsing, and demonstrates the benefits of combining the symbolic and connectionist approaches.
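The reranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tree encoding, the toy embedding function `word_vector`, the randomly initialised parameters in `make_params`, and the single-layer scorer are all assumptions made for illustration, and the weights are untrained. It only shows the data flow: compose phrase vectors bottom-up with a recursive network, map the root vector to a correctness score, and sort the k-best list by that score.

```python
import math
import random

DIM = 4  # toy vector size (assumption, chosen only to keep the example small)


def make_params(seed=0):
    """Randomly initialised, untrained parameters (hypothetical)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * DIM)] for _ in range(DIM)]
    b = [0.0] * DIM
    u = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]
    return W, b, u


def word_vector(word):
    """Deterministic toy embedding; a real system would use learned vectors."""
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def compose(tree, W, b):
    """Build a vector for a binary tree bottom-up: p = tanh(W [left; right] + b).

    A leaf is a word string; an internal node is (label, left, right).
    The label is ignored in this simplified sketch.
    """
    if isinstance(tree, str):
        return word_vector(tree)
    _label, left, right = tree
    children = compose(left, W, b) + compose(right, W, b)  # concatenation
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
        for row, bias in zip(W, b)
    ]


def correctness_score(tree, params):
    """Map the root vector to a score in (0, 1) with a logistic unit."""
    W, b, u = params
    root = compose(tree, W, b)
    z = sum(ui * xi for ui, xi in zip(u, root))
    return 1.0 / (1.0 + math.exp(-z))


def rerank(kbest, params):
    """Sort candidate parses from highest to lowest correctness score."""
    return sorted(kbest, key=lambda t: correctness_score(t, params), reverse=True)


# Two candidate parses of the same sentence (PP attachment differs).
cand_a = ("S", "I", ("VP", ("VP", "saw", "him"), ("PP", "with", "telescopes")))
cand_b = ("S", "I", ("VP", "saw", ("NP", "him", ("PP", "with", "telescopes"))))

params = make_params()
ranked = rerank([cand_a, cand_b], params)
```

With untrained weights the resulting order carries no meaning; in the setting the abstract describes, the classifier would first be trained to separate correct from incorrect parses, and its scores would then reorder the k-best list produced by the symbolic parser.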
Similar resources
An SVM-based voting algorithm with application to parse reranking
This paper introduces a novel Support Vector Machines (SVMs) based voting algorithm for reranking, which provides a way to solve the sequential models indirectly. We have presented a risk formulation under the PAC framework for this voting algorithm. We have applied this algorithm to the parse reranking problem, and achieved labeled recall and precision of 89.4%/89.8% on WSJ section 23 of Penn ...
Learning Semantic Parsers Using Statistical Syntactic Parsing Techniques
Most recent work on semantic analysis of natural language has focused on “shallow” semantics such as word-sense disambiguation and semantic role labeling. Our work addresses a more ambitious task we call semantic parsing where natural language sentences are mapped to complete formal meaning representations. We present our system SCISSOR based on a statistical parser that generates a semanticall...
Exploiting Parse Structures for Native Language Identification
Attempts to profile authors according to their characteristics extracted from textual data, including native language, have drawn attention in recent years, via various machine learning approaches utilising mostly lexical features. Drawing on the idea of contrastive analysis, which postulates that syntactic errors in a text are to some extent influenced by the native language of an author, this...
Learning Parse-Free Event-Based Features for Textual Entailment Recognition
We propose new parse-free event-based features to be used in conjunction with lexical, syntactic, and semantic features of texts and hypotheses for Machine Learning-based Recognizing Textual Entailment. Our new similarity features are extracted without using shallow semantic parsers, but still lexical and compositional semantics are not left out. Our experimental results demonstrate that these ...
Boosting-based Parse Reranking with Subtree Features
This paper introduces a new application of boosting for parse reranking. Several parsers have been proposed that utilize the all-subtrees representation (e.g., tree kernel and data oriented parsing). This paper argues that such an all-subtrees representation is extremely redundant and a comparable accuracy can be achieved using just a small set of subtrees. We show how the boosting algorithm ca...